Caching Patterns in System Design

πŸ“š Core Concepts​

Cache Hit vs Cache Miss​

| Term | Definition | Speed | What Happens |
|------|------------|-------|--------------|
| Cache Hit ✅ | Requested data found in cache | Fast | Data retrieved directly from cache |
| Cache Miss ❌ | Requested data not in cache | Slow | Must fetch from main storage, then optionally cache it |

Analogy​

Think of the cache as your desk drawer and main storage as a library shelf:

  • Cache hit: Find your notebook in the desk drawer β†’ instant access ⚑
  • Cache miss: Not in drawer β†’ walk to library shelf β†’ slower retrieval 🐌

πŸ”„ Caching Patterns​

1. Cache-Aside (Lazy Loading)​

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Application β”‚
β””β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”˜
β”‚
β–Ό
Check Cache?
β”œβ”€ Hit β†’ Return
└─ Miss β†’ Fetch DB β†’ Store in Cache β†’ Return

Characteristics:

  • Application manages cache explicitly
  • Cache populated on-demand
  • Most common pattern

Pros:

  • βœ… Only caches what's actually needed
  • βœ… Cache failure doesn't break the system
  • βœ… Flexible - app has full control

Cons:

  • ❌ First request always slow (cache miss)
  • ❌ Extra code in application layer
  • ❌ Potential for stale data

When to Use:

  • Read-heavy workloads
  • When you want fine-grained control
  • General-purpose caching (Redis, Memcached)

Real-World Example: E-commerce product catalog caching
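The flow above can be sketched in a few lines of Python. This is a minimal in-process illustration: the two dicts stand in for the database and for Redis/Memcached, and names like `get_product` are made up for the example.

```python
# Stand-ins for the real stores: `db` plays the database, `cache` plays Redis.
db = {"product:1": {"name": "Laptop", "price": 999}}
cache = {}

def get_product(key):
    """Cache-aside read: try the cache, fall back to the DB on a miss."""
    if key in cache:
        return cache[key]              # cache hit: fast path
    value = db.get(key)                # cache miss: go to the primary store
    if value is not None:
        cache[key] = value             # populate on demand (lazy loading)
    return value

def update_product(key, value):
    """Cache-aside write: update the DB, then invalidate the cached copy."""
    db[key] = value
    cache.pop(key, None)               # next read repopulates the cache
```

Invalidating (rather than rewriting) the cache entry on writes keeps the write path simple, at the cost of one extra miss on the next read.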


2. Read-Through​

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Application β”‚ ──Read──> β”Œβ”€β”€β”€β”€β”€β”€β”€β”
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚ Cache β”‚ ──Auto Fetch──> Database
β””β”€β”€β”€β”€β”€β”€β”€β”˜

Characteristics:

  • Cache layer handles data loading
  • Transparent to application
  • Cache acts as abstraction over DB

Pros:

  • βœ… Simpler application code
  • βœ… Centralized cache logic
  • βœ… Consistent read interface

Cons:

  • ❌ First request still slow
  • ❌ Adds complexity to cache layer
  • ❌ Tighter coupling with cache system

When to Use:

  • When you want cache to own data loading
  • Frameworks that support it (Ehcache, Caffeine)
  • Microservices with dedicated cache service

Difference from Cache-Aside: the cache layer performs the DB fetch itself, rather than the application.
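A minimal sketch of that inversion: the cache is constructed with a loader function and fetches from the backing store on its own. The class and names here are illustrative, not a real library API.

```python
class ReadThroughCache:
    """The cache owns the loader; callers never talk to the DB directly."""

    def __init__(self, loader):
        self._loader = loader          # e.g. a DB query function
        self._store = {}

    def get(self, key):
        if key not in self._store:
            # Transparent to the caller: the cache fetches on a miss.
            self._store[key] = self._loader(key)
        return self._store[key]

db = {"user:1": "alice"}
users = ReadThroughCache(loader=db.get)
```

The application only ever calls `users.get(...)`; where the data came from is the cache's concern.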


3. Write-Through​

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Application β”‚ ──Write──┬──> Cache (sync)
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ └──> Database (sync)

Characteristics:

  • Every write goes to both cache and DB
  • Synchronous double-write
  • Strong consistency guaranteed

Pros:

  • βœ… Cache always fresh and consistent
  • βœ… No risk of stale reads
  • βœ… Simple consistency model

Cons:

  • ❌ Higher write latency (two operations)
  • ❌ Wasted writes for rarely-read data
  • ❌ Cache can fill with cold data

When to Use:

  • Strong consistency requirements
  • Read-after-write scenarios common
  • Financial systems, user profiles

Real-World Example: User session data, account balances
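The synchronous double-write is easy to see in code. A minimal sketch, again with dicts standing in for the real stores:

```python
class WriteThroughCache:
    """Every write lands in both the DB and the cache before returning."""

    def __init__(self, db):
        self._db = db
        self._store = {}

    def put(self, key, value):
        self._db[key] = value          # synchronous DB write first...
        self._store[key] = value       # ...then the cache, keeping them in step

    def get(self, key):
        return self._store.get(key, self._db.get(key))

db = {}
balances = WriteThroughCache(db)
```

Writing the DB first means a cache failure after the DB write leaves you with a stale-but-recoverable cache rather than a cached value the DB never saw.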


4. Write-Behind / Write-Back​

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ Application β”‚ ──Write──> Cache (fast return) ~~async batch~~> Database
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

Characteristics:

  • Writes happen to cache only
  • DB updated later in batches
  • Eventually consistent

Pros:

  • βœ… Extremely fast writes
  • βœ… Can batch/coalesce multiple writes
  • βœ… Reduces DB load significantly

Cons:

  • ❌ Risk of data loss if cache fails
  • ❌ Complexity in failure handling
  • ❌ Eventual consistency only

When to Use:

  • High write throughput needed
  • Acceptable to lose recent writes on failure
  • Analytics pipelines, logging systems

Real-World Example: Page view counters, metrics aggregation
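The batching and coalescing behavior can be sketched as follows. This toy version flushes when the dirty set reaches a size threshold; a real implementation would flush from a background thread or timer and handle flush failures.

```python
class WriteBehindCache:
    """Writes hit the cache only; the DB is updated later in batches."""

    def __init__(self, db, batch_size=3):
        self._db = db
        self._store = {}
        self._dirty = {}               # pending writes, coalesced per key
        self._batch_size = batch_size

    def put(self, key, value):
        self._store[key] = value       # fast return: cache-only write
        self._dirty[key] = value       # repeated writes to a key coalesce
        if len(self._dirty) >= self._batch_size:
            self.flush()

    def flush(self):
        self._db.update(self._dirty)   # one batched DB write
        self._dirty.clear()

db = {}
counters = WriteBehindCache(db, batch_size=2)
```

Note the data-loss risk from the Cons list is visible here: anything still in `_dirty` when the process dies never reaches the DB.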


5. Write-Around​

Write: Application ────────────────────> Database (bypass cache)
Read:  Application ──> Cache? ──Miss──> Database ──> Cache

Characteristics:

  • Writes skip the cache entirely
  • Cache populated only on reads
  • Prevents cache pollution

Pros:

  • βœ… Avoids caching rarely-read data
  • βœ… Keeps cache focused on hot data
  • βœ… Better cache hit ratio for actual reads

Cons:

  • ❌ First read after write always misses
  • ❌ Higher latency for read-after-write
  • ❌ Not ideal for write-then-read patterns

When to Use:

  • High write volume, low read volume
  • Write-once-read-never scenarios
  • Log ingestion, data warehousing

Real-World Example: Event logging, audit trails
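Both paths fit in a few lines. A minimal sketch with dicts as the two stores:

```python
db, cache = {}, {}

def write(key, value):
    db[key] = value                    # writes bypass the cache entirely

def read(key):
    if key in cache:
        return cache[key]
    value = db.get(key)                # first read after a write misses...
    if value is not None:
        cache[key] = value             # ...and only then is the entry cached
    return value
```

The read path is identical to cache-aside; the pattern is defined by what the write path does *not* do.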


6. Refresh-Ahead (Proactive Caching)​

Cache monitors TTL ──> Preemptively refreshes BEFORE expiration

Characteristics:

  • Predictive cache warming
  • Reduces cache misses for hot data
  • Requires usage pattern prediction

Pros:

  • βœ… Minimizes cache misses
  • βœ… Consistent low latency
  • βœ… Great for predictable access patterns

Cons:

  • ❌ Wastes resources on cold data
  • ❌ Complex implementation
  • ❌ Needs good prediction algorithm

When to Use:

  • Frequently accessed data with predictable patterns
  • Low-latency requirements (gaming, trading)
  • Content delivery networks (CDN)

Real-World Example: Homepage content, trending articles
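One common way to implement this is to reload an entry once it has lived past some fraction of its TTL. The sketch below is synchronous so the logic stays visible; real implementations refresh in the background so the caller still gets the old value instantly. `refresh_ratio` and the injectable `now` parameter are choices made for this example.

```python
import time

class RefreshAheadCache:
    """Reload an entry once it passes `refresh_ratio` of its TTL, so hot
    keys are refreshed before they ever expire."""

    def __init__(self, loader, ttl=60.0, refresh_ratio=0.8):
        self._loader = loader
        self._ttl = ttl
        self._refresh_ratio = refresh_ratio
        self._store = {}               # key -> (value, loaded_at)

    def get(self, key, now=None):
        now = time.monotonic() if now is None else now
        entry = self._store.get(key)
        if entry is None or now - entry[1] >= self._ttl:
            value = self._loader(key)  # hard miss or fully expired
        elif now - entry[1] >= self._ttl * self._refresh_ratio:
            value = self._loader(key)  # still valid: refresh ahead of expiry
        else:
            return entry[0]            # fresh: serve straight from cache
        self._store[key] = (value, now)
        return value

loads = []
def loader(key):
    loads.append(key)                  # simulate the expensive render/query
    return f"render #{len(loads)}"

homepage = RefreshAheadCache(loader, ttl=10.0, refresh_ratio=0.8)
```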


7. TTL (Time-to-Live) Based​

Cache Entry [Created] ──(time passes)──> [TTL Expires] ──> Auto-removed

Characteristics:

  • Time-based expiration
  • Simplest invalidation strategy
  • Combined with other patterns

Pros:

  • βœ… Simple to implement
  • βœ… Prevents indefinitely stale data
  • βœ… Works with any caching pattern

Cons:

  • ❌ Can cause cache miss storms at expiration
  • ❌ Arbitrary time selection
  • ❌ May evict still-valid data

When to Use:

  • Data with known freshness requirements
  • Combined with most caching strategies
  • Session tokens, temporary data

Real-World Example: API rate limiting, JWT tokens
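A minimal sketch of TTL expiry, dropping entries lazily on read (the strategy Redis combines with active background expiry). The injectable `now` parameter is there purely to make the example testable.

```python
import time

class TTLCache:
    """Entries carry an absolute expiry time and are dropped lazily on read."""

    def __init__(self, ttl=30.0):
        self._ttl = ttl
        self._store = {}               # key -> (value, expires_at)

    def put(self, key, value, now=None):
        now = time.monotonic() if now is None else now
        self._store[key] = (value, now + self._ttl)

    def get(self, key, now=None):
        now = time.monotonic() if now is None else now
        entry = self._store.get(key)
        if entry is None or now >= entry[1]:
            self._store.pop(key, None) # expired: treat as a miss
            return None
        return entry[0]

sessions = TTLCache(ttl=30.0)
```

In practice, add jitter to the TTL per key so a burst of entries written together doesn't expire together (see cache miss storms below).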


πŸ—‘οΈ Eviction Policies​

LRU (Least Recently Used)​

  • Strategy: Evicts items not accessed recently
  • Best for: Temporal locality (recently used = likely to be used again)
  • Example: Web page caching

LFU (Least Frequently Used)​

  • Strategy: Evicts items accessed least often
  • Best for: Popular content, frequency-based access
  • Example: Video streaming platforms

FIFO (First In First Out)​

  • Strategy: Evicts oldest entries
  • Best for: Simple queue-like behavior
  • Example: Basic message queues

Random Replacement​

  • Strategy: Evicts random entries
  • Best for: When no clear pattern exists, lowest overhead
  • Example: Simple distributed caches
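LRU, the most common of these policies, fits neatly on top of Python's `OrderedDict`, which keeps keys in insertion order and lets us reuse that order as recency order:

```python
from collections import OrderedDict

class LRUCache:
    """Bounded cache that evicts the least recently used entry when full."""

    def __init__(self, capacity):
        self._capacity = capacity
        self._store = OrderedDict()    # insertion order doubles as recency order

    def get(self, key):
        if key not in self._store:
            return None
        self._store.move_to_end(key)   # touched: now most recently used
        return self._store[key]

    def put(self, key, value):
        self._store[key] = value
        self._store.move_to_end(key)
        if len(self._store) > self._capacity:
            self._store.popitem(last=False)  # evict the LRU entry
```

Both operations are O(1); production caches (Caffeine, Redis) use approximations of LRU/LFU to avoid even this bookkeeping under contention.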

πŸ“Š Pattern Comparison Matrix​

| Pattern | Write Speed | Read Speed | Consistency | Complexity | Data Loss Risk |
|---------|-------------|------------|-------------|------------|----------------|
| Cache-Aside | 🟡 Medium | 🟢 Fast* | 🟡 Eventual | 🟢 Low | 🟢 Low |
| Read-Through | 🟡 Medium | 🟢 Fast* | 🟡 Eventual | 🟡 Medium | 🟢 Low |
| Write-Through | 🔴 Slow | 🟢 Very Fast | 🟢 Strong | 🟢 Low | 🟢 None |
| Write-Back | 🟢 Very Fast | 🟢 Very Fast | 🟡 Eventual | 🔴 High | 🔴 High |
| Write-Around | 🟢 Fast | 🟡 Medium | 🟡 Eventual | 🟢 Low | 🟢 None |
| Refresh-Ahead | 🟡 Medium | 🟢 Very Fast | 🟡 Eventual | 🔴 High | 🟢 Low |

*After initial cache miss


🎯 Common Pattern Combinations​

High-Traffic Web Application​

Read Strategy: Cache-Aside + LRU eviction
Write Strategy: Write-Through for critical data
TTL: 5-15 minutes for most content
Tools: Redis, Memcached

Analytics Pipeline​

Read Strategy: Read-Through
Write Strategy: Write-Back (batch inserts)
Eviction: LFU (frequently queried reports)
Tools: Apache Ignite, Hazelcast

E-commerce Product Catalog​

Read Strategy: Cache-Aside + Refresh-Ahead for bestsellers
Write Strategy: Write-Around for inventory updates
TTL: 1 hour for product details
Tools: Redis with pub/sub for invalidation

Social Media Feed​

Read Strategy: Cache-Aside + Refresh-Ahead for active users
Write Strategy: Write-Back for likes/views
TTL: 30 seconds for feed items
Eviction: LRU
Tools: Redis Cluster

🌳 Decision Tree​

β”Œβ”€ Need strong consistency?
β”‚ β”œβ”€ YES β†’ Write-Through
β”‚ └─ NO ↓
β”‚
β”œβ”€ High write volume?
β”‚ β”œβ”€ YES ↓
β”‚ β”‚ β”œβ”€ Can tolerate data loss?
β”‚ β”‚ β”‚ β”œβ”€ YES β†’ Write-Back
β”‚ β”‚ β”‚ └─ NO β†’ Write-Around
β”‚ └─ NO β†’ Cache-Aside
β”‚
β”œβ”€ Need ultra-low read latency?
β”‚ └─ Add Refresh-Ahead
β”‚
└─ Cache filling up?
└─ Choose eviction:
β”œβ”€ Temporal patterns β†’ LRU
└─ Popularity-based β†’ LFU

βœ… Best Practices​

  1. Start with Cache-Aside

    • Most flexible and widely understood
    • Easy to debug and reason about
  2. Always Set TTL

    • Even with other invalidation strategies
    • Prevents unbounded cache growth
  3. Monitor Cache Hit Ratio

    • Aim for a hit ratio above 80%
    • Alert on sudden drops
  4. Handle Cache Failures Gracefully

    • App should work even if cache is down
    • Implement circuit breakers
  5. Use Appropriate Serialization

    • Consider Protobuf/MessagePack over JSON
    • Faster and more compact
  6. Warm Critical Caches on Startup

    • Don't wait for cold starts
    • Pre-populate frequently accessed data
  7. Implement Cache Stampede Protection

    • Use locks/semaphores for cache misses
    • Prevent thundering herd
  8. Size Your Cache Appropriately

    • Monitor eviction rates
    • Balance memory cost vs hit rate

⚑ Performance Tips​

  • Batch operations when possible (especially with Write-Back)
  • Use pipeline/multi-get for multiple keys (Redis MGET, MSET)
  • Consider cache-aside for writes even with read-through for reads
  • Implement circuit breakers for cache failures
  • Use connection pooling for cache clients
  • Monitor P99 latencies, not just averages
  • Compress large values before caching
  • Use appropriate data structures (Redis Hashes, Sets, Sorted Sets)

⚠️ Common Pitfalls​

❌ Cache Stampede​

Problem: Multiple requests reload same expired data simultaneously

Solution:

  • Locking mechanisms (distributed locks)
  • Early recomputation (refresh before expiry)
  • Probabilistic early expiration
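The locking approach can be sketched as below, using a single in-process lock with a double-check so only one caller recomputes an entry while the rest wait. Across multiple servers you would use a per-key distributed lock instead (e.g. Redis `SET` with `NX` and an expiry); the counter here just makes the "computed once" property observable.

```python
import threading

db = {"page:home": "<html>rendered page</html>"}
cache = {}
calls = {"n": 0}                       # counts expensive recomputations
_lock = threading.Lock()

def load_from_db(key):
    calls["n"] += 1                    # simulate the expensive query/render
    return db.get(key)

def get_protected(key):
    if key in cache:
        return cache[key]              # fast path: no lock taken on a hit
    with _lock:
        if key in cache:               # double-check: another thread filled it
            return cache[key]
        value = load_from_db(key)      # only one caller recomputes
        cache[key] = value
        return value
```

A single global lock serializes *all* misses; production code uses one lock per key (or a striped lock array) so unrelated misses don't queue behind each other.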

❌ Stale Data​

Problem: Cache inconsistent with database

Solution:

  • Proper TTL settings
  • Invalidation on writes
  • Event-driven cache updates

❌ Cache Pollution​

Problem: Rarely-used data fills cache

Solution:

  • Write-Around pattern
  • Better eviction policies (LRU/LFU)
  • Cache only frequently accessed data

❌ Over-caching​

Problem: Caching everything indiscriminately

Solution:

  • Profile and measure what to cache
  • Cache only expensive queries
  • Monitor cache hit rates per key pattern

❌ No Monitoring​

Problem: Not knowing hit rates, evictions, or issues

Solution:

  • Implement comprehensive metrics
  • Dashboard for cache health
  • Alerts for anomalies

❌ Ignoring Cache Warm-up​

Problem: Cold start causes poor initial performance

Solution:

  • Pre-populate cache on deployment
  • Gradual traffic ramping
  • Keep cache instances alive during deployments

πŸ“ˆ Key Metrics to Monitor​

| Metric | What It Measures | Target |
|--------|------------------|--------|
| Hit Rate | % of requests served from cache | >80% |
| Miss Rate | % of requests requiring DB fetch | <20% |
| Eviction Rate | How often data is removed | Low & stable |
| Memory Usage | Cache memory consumption | <80% capacity |
| Latency (P50, P99) | Response time distribution | <10ms P99 |
| Throughput | Operations per second | Application dependent |
| Connection Pool | Active connections | Stable |
| Error Rate | Failed cache operations | <0.1% |

In-Memory Caches​

  • Redis - Feature-rich, supports data structures, persistence
  • Memcached - Simple, fast, lightweight
  • Hazelcast - Distributed, Java-based, compute capabilities

Application-Level Caches​

  • Caffeine - High-performance Java cache library
  • Ehcache - Java cache with disk persistence
  • Guava Cache - Simple in-process cache for Java

CDN/Edge Caches​

  • CloudFlare - Global CDN with edge caching
  • AWS CloudFront - Integrated with AWS services
  • Fastly - Real-time CDN with VCL customization

Distributed Caches​

  • Apache Ignite - Distributed database and cache
  • Aerospike - High-performance distributed cache
  • Couchbase - Document DB with built-in caching

πŸ“š Further Reading​


πŸŽ“ Quick Reference Cheat Sheet​

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ WHEN TO USE WHICH PATTERN β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚ Read-Heavy + Control β†’ Cache-Aside β”‚
β”‚ Strong Consistency β†’ Write-Through β”‚
β”‚ High Write Throughput β†’ Write-Back β”‚
β”‚ Rarely Read After Write β†’ Write-Around β”‚
β”‚ Predictable Hot Data β†’ Refresh-Ahead β”‚
β”‚ Time-Sensitive Data β†’ TTL-Based β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ EVICTION POLICY SELECTION β”‚
β”œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€
β”‚ Recent = Relevant β†’ LRU β”‚
β”‚ Frequency Matters β†’ LFU β”‚
β”‚ Simple Queue β†’ FIFO β”‚
β”‚ No Pattern / Testing β†’ Random β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜